Does constituent disambiguation facilitate compound interpretation?
Abstract
In English, nominal compounding is a very productive means of word formation and hence of lexical expansion. Recent additions to the Oxford English Dictionary online include e.g. bucket list – 'a list of things that a person hopes to experience or achieve before they die (or kick the bucket)' – and trout pout – 'unnaturally swollen lips resulting from the injection of excessive collagen into the lips in a cosmetic procedure intended to enhance their appearance'. As these examples demonstrate, although the compound meaning bears some relation to the meanings of the constituent nouns, the relation between them is unexpressed and must therefore be inferred by the hearer or reader. In some cases, the constituent nouns themselves seem to suggest a default reading; for example a compound like polo umpire might be expected to lead to context-free consensus about the meaning, because an umpire officiates in the playing of games, and polo is a game. However, in many cases, compound interpretation depends on other sources of information, including not only deixis and explicit explanation, but also context and world knowledge. This study explores the role of immediate linguistic context in disambiguating newly-coined compounds. All noun-noun strings occurring within a sentence were extracted from the prose fiction section of the British National Corpus. This section was chosen to reduce the risk of selecting compounds whose interpretation would rely on deixis, specialised technical knowledge or highly time-specific information as might be expected in e.g. newspapers. To create a set of hapaxes, the sample was then reduced to items that occurred only once in the whole corpus, and not at all in ukWaC, a much larger corpus of more than 2 billion words. A random selection of 80 hapaxes was then examined in their sentential context. This led to the hypothesis that, in the majority of cases, the compound could be disambiguated on the basis of the
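The sampling procedure described in the abstract (extract noun-noun strings within sentences, keep only those that occur once in the corpus and never in a larger reference corpus, then draw a random sample) can be sketched roughly as follows. This is a minimal illustration, not the authors' actual pipeline. It assumes the corpus is already available as part-of-speech-tagged sentences, that common-noun tags begin with "NN", and that a "noun-noun string" means two adjacent nouns. The function names, the toy sentences and the `reference_pairs` set standing in for ukWaC are all hypothetical.

```python
import random
from collections import Counter


def is_noun(tag):
    # Assumption: common-noun tags start with "NN"; adjust for the real tagset.
    return tag.startswith("NN")


def noun_noun_pairs(sentence):
    """Yield lower-cased pairs of adjacent nouns found within one sentence."""
    for (w1, t1), (w2, t2) in zip(sentence, sentence[1:]):
        if is_noun(t1) and is_noun(t2):
            yield (w1.lower(), w2.lower())


def sample_hapaxes(sentences, reference_pairs, k=80, seed=0):
    """Return up to k randomly chosen pairs that occur exactly once in
    `sentences` and never in `reference_pairs` (the larger corpus)."""
    counts = Counter(p for s in sentences for p in noun_noun_pairs(s))
    hapaxes = sorted(
        p for p, n in counts.items() if n == 1 and p not in reference_pairs
    )
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return rng.sample(hapaxes, min(k, len(hapaxes)))


# Toy data for illustration only (hypothetical sentences and tags).
sentences = [
    [("the", "AT0"), ("polo", "NN1"), ("umpire", "NN1"), ("left", "VVD")],
    [("a", "AT0"), ("bucket", "NN1"), ("list", "NN1")],
    [("her", "DPS"), ("bucket", "NN1"), ("list", "NN1"), ("grew", "VVD")],
    [("a", "AT0"), ("trout", "NN1"), ("pout", "NN1")],
]
reference_pairs = {("trout", "pout")}  # stands in for the reference corpus

result = sample_hapaxes(sentences, reference_pairs, k=80)
print(result)
```

In the toy data, "bucket list" is dropped because it occurs twice and "trout pout" because it appears in the reference set, so only "polo umpire" survives the filter. Sentence boundaries are respected because pairs are formed only inside each sentence.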
Similar resources
Efficient sentence disambiguation by preferred constituent order
A major problem with (partially) free constituent order is to manifest preferences among structurally distinct parses of ambiguous sentences. In order to obtain scoring criteria a preferred constituent order can considerably support a best-first strategy. This work presents an experimentally evaluated model of preferred German constituent order in the middle field and its application for the impleme...
Building Disambiguation System for Compound Noun Analysis Based on Lexical Conceptual Structure
In this paper, we propose a principled approach for disambiguating relations between constituent words of compound nouns whose heads are deverbal nouns, using the framework of lexical conceptual structure. The aim of this research is to reveal the complete set of lexical factors and disambiguation rules needed for application. The results of experiments for Japanese deverbal compounds and nominal...
Disambiguating Compound Nouns for a Dynamic HPSG Treebank of Wall Street Journal Texts
The aim of this paper is twofold. We focus, on the one hand, on the task of dynamically annotating English compound nouns, and on the other hand we propose disambiguation methods and techniques which facilitate the annotation task. Both the aforementioned are part of a larger on-going effort which aims to create HPSG annotation for the texts from the Wall Street Journal (henceforward WSJ) secti...
Combining resources for MWE-token classification
We study the task of automatically disambiguating word combinations such as jump the gun which are ambiguous between a literal and MWE interpretation, focusing on the utility of type-level features from an MWE lexicon for the disambiguation task. To this end we combine gold-standard idiomaticity of tokens in the OpenMWE corpus with MWE-type-level information drawn from the recently-published JD...
The Disambiguation of Nominalizations
This article addresses the interpretation of nominalizations, a particular class of compound nouns whose head noun is derived from a verb and whose modifier is interpreted as an argument of this verb. Any attempt to automatically interpret nominalizations needs to take into account: (a) the selectional constraints imposed by the nominalized compound head, (b) the fact that the relation of the m...
Publication date: 2017